Illumina NovaSeq 6000 paired end sequencing - SRA

ERX3266987: Illumina NovaSeq 6000 paired end sequencing
25 ILLUMINA (Illumina NovaSeq 6000) runs: 1.3G spots, 380.2G bases, 87.8Gb downloads

Submitted by: NYGC

Study: 30X whole genome sequencing coverage of the 2504 Phase 3 1000 Genome samples.

PRJEB31736 • ERP114329

Abstract:

We sequenced all 2,504 samples from the 1000 Genomes (1KG) Project to a minimum of 30x mean genome coverage. Although a small number of 1KG samples had previously been sequenced to high coverage, we sequenced all samples to depth on the latest technology, providing a unified dataset for the next phase of analyses. We processed these samples using the laboratory processes we previously used for the CCDG project, with minor modifications. Specifically, we generated PCR-free sequencing libraries with unique dual indices to avoid index switching, which causes low-level contamination of sequencing data on Illumina patterned flow cells. We sequenced the samples on the Illumina NovaSeq 6000 instrument with 2x150 bp reads. We believe this instrument represents the future of short-read WGS, and it was important to sequence the 1KG samples in a format consistent with future large-scale sequencing projects. Our automated analysis pipeline for whole genome sequencing matches the CCDG and TOPMed recommended best practices. Sequencing reads were aligned to the human reference, hs38DH, using BWA-MEM v0.7.15. Data were further processed following the GATK best practices (v3.5), generating VCF files in the 4.2 format. Single nucleotide variants and indels were called using GATK HaplotypeCaller (v3.5), which generates a single-sample GVCF. Variant Quality Score Recalibration (VQSR) was performed using dbSNP138 so that quality metrics for each variant can be used in downstream variant filtering.
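The "30x mean genome coverage" target can be related to raw read counts with simple arithmetic: total sequenced bases divided by genome length. The sketch below is illustrative only. The genome length of 3.1e9 bp is an approximation introduced here, not a figure from this record, and the function name `mean_coverage` is hypothetical. The estimate uses raw bases, so it ignores duplicates, unmapped reads, and trimming, and will overstate usable depth.

```python
# Rough mean-coverage estimate: total sequenced bases / genome length.
# ASSUMPTION: ~3.1e9 bp approximates the human genome; this value is not
# taken from the record.
GENOME_LENGTH_BP = 3.1e9


def mean_coverage(spots: int,
                  read_length: int = 150,
                  reads_per_spot: int = 2,
                  genome_length: float = GENOME_LENGTH_BP) -> float:
    """Approximate raw mean depth for paired-end data (2x150 bp by default)."""
    total_bases = spots * read_length * reads_per_spot
    return total_bases / genome_length


if __name__ == "__main__":
    # Spot count of run ERR3239612, as listed in this record's run table.
    depth = mean_coverage(422_392_359)
    print(f"{depth:.1f}x")  # prints 40.9x
```

Under these assumptions, the largest run listed for this sample exceeds the stated 30x minimum on raw bases alone.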

Sample: Coriell GM19084

SAMN00000541 • SRS000771

Organism: Homo sapiens

Library:

Name: NA19084

Instrument: Illumina NovaSeq 6000

Strategy: WGS

Source: GENOMIC

Selection: RANDOM

Layout: PAIRED

Construction protocol: TruSeq DNA PCR-free

Runs: 25 runs, 1.3G spots, 380.2G bases, 87.8Gb

Run	# of Spots	# of Bases	Size	Published
ERR3239612	422,392,359	126.7G	11.8Gb	2019-03-25
ERR3961111	38,477,614	11.5G	3.5Gb	2020-03-03
ERR3961112	37,613,737	11.3G	3.4Gb	2020-03-03
ERR3961113	38,105,758	11.4G	3.4Gb	2020-03-03
ERR3961114	37,654,298	11.3G	3.4Gb	2020-03-03
ERR3961115	37,409,672	11.2G	3.4Gb	2020-03-03
ERR3961116	30,244,685	9.1G	2.8Gb	2020-03-03
ERR3961117	38,540,689	11.6G	3.5Gb	2020-03-03
ERR3961118	29,558,847	8.9G	2.7Gb	2020-03-03
ERR3961119	38,322,085	11.5G	3.4Gb	2020-03-03
ERR3961120	37,937,639	11.4G	3.5Gb	2020-03-03
ERR3961121	28,784,232	8.6G	2.6Gb	2020-03-03
ERR3961122	29,743,103	8.9G	2.7Gb	2020-03-03
ERR4984073	38,477,614	11.5G	3.4Gb	2020-12-16
ERR4984074	37,613,737	11.3G	3.4Gb	2020-12-16
10 of the 25 runs are omitted from this table; the full list is available in the SRA Run Selector.
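The run table can be sanity-checked against the stated 2x150 bp layout: each spot should contribute 300 bases, so the "# of Bases" column should equal spots × 300, rounded to one decimal in units of 10^9. The following is a minimal sketch using three rows copied from the table; the helper names `parse_runs` and `expected_bases_g` are hypothetical, and the assumption that "G" means 10^9 bases is ours.

```python
# Three rows copied from the run table above (whitespace-separated here).
RUN_TABLE = """
ERR3239612 422,392,359 126.7G 11.8Gb 2019-03-25
ERR3961111 38,477,614 11.5G 3.5Gb 2020-03-03
ERR3961116 30,244,685 9.1G 2.8Gb 2020-03-03
"""


def parse_runs(text: str) -> list[dict]:
    """Parse run rows into dicts with integer spots and float gigabases."""
    runs = []
    for line in text.strip().splitlines():
        accession, spots, bases, size, published = line.split()
        runs.append({
            "accession": accession,
            "spots": int(spots.replace(",", "")),
            "bases_g": float(bases.rstrip("G")),  # ASSUMPTION: G = 1e9 bases
            "size": size,
            "published": published,
        })
    return runs


def expected_bases_g(spots: int, read_length: int = 150) -> float:
    """Bases implied by a paired-end layout, in units of 1e9, one decimal."""
    return round(spots * 2 * read_length / 1e9, 1)


if __name__ == "__main__":
    for run in parse_runs(RUN_TABLE):
        ok = abs(expected_bases_g(run["spots"]) - run["bases_g"]) < 0.05
        print(run["accession"], run["bases_g"], "consistent" if ok else "MISMATCH")
```

All three sampled rows agree with the 2x150 bp layout under these assumptions; the same check can be applied to the remaining rows.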

ID: 7513407
